-
-
-
- AWK User Commands AWK
-
-
-
- NAME
- awk - pattern-directed scanning and processing language
-
- SYNOPSIS
- awk [ -Ffs ] [ prog ] [ file ... ]
-
- DESCRIPTION
- Awk scans each input file for lines that match any of a set
- of patterns specified literally in prog or in a file
- specified as -f file. With each pattern there can be an
- associated action that will be performed when a line of a
- file matches the pattern. Each line is matched against the
- pattern portion of every pattern-action statement; the
- associated action is performed for each matched pattern. The
- file name - means the standard input. Any file of the form
- var=value is treated as an assignment, not a filename.
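-
- A minimal sketch of both operand conventions, assuming a
- POSIX shell and made-up input; the variable name tag is
- hypothetical and chosen only for illustration:

```sh
# tag='> ' is taken as an assignment, and - names the standard input.
out=$(printf 'x\ny\n' | awk '{ print tag $0 }' tag='> ' -)
printf '%s\n' "$out"
```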
-
- An input line is made up of fields separated by white space,
- or by regular expression FS. The fields are denoted $1, $2,
- ...; $0 refers to the entire line.
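-
- As an illustration of field splitting (the input lines are
- invented for this sketch), the first command uses the
- default white-space separator and the second sets the
- separator with -F:

```sh
# Default splitting: runs of blanks and tabs separate fields.
first=$(printf 'alpha   beta\tgamma\n' | awk '{ print $2 }')
printf '%s\n' "$first"

# -F: makes a colon the separator; $0 is still the whole line.
second=$(printf 'root:x:0\n' | awk -F: '{ print $3; print $0 }')
printf '%s\n' "$second"
```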
-
- A pattern-action statement has the form
-
- pattern { action }
-
- A missing { action } means print the line; a missing pattern
- always matches. Pattern-action statements are separated by
- newlines or semicolons.
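-
- The two defaults can be seen side by side in the following
- sketch; the three input lines are made up:

```sh
input='one
two words
three'

# Pattern with no action: each matching line is printed.
only_pattern=$(printf '%s\n' "$input" | awk 'NF > 1')
printf '%s\n' "$only_pattern"

# Action with no pattern: the action runs on every line.
only_action=$(printf '%s\n' "$input" | awk '{ print NF }')
printf '%s\n' "$only_action"
```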
-
- An action is a sequence of statements. A statement can be
- one of the following:
-
- if( expression ) statement [ else statement ]
- while( expression ) statement
- for( expression ; expression ; expression ) statement
- for( var in array ) statement
- do statement while( expression )
- break
- continue
- { [ statement ... ] }
- expression              # commonly var = expression
- print [ expression-list ] [ > expression ]
- printf format [ , expression-list ] [ > expression ]
- return [ expression ]
- next                    # skip remaining patterns on this input line
- delete array[ expression ]  # delete an array element
- exit [ expression ]     # exit immediately; status is expression
-
- Statements are terminated by semicolons, newlines or right
- braces. An empty expression-list stands for $0. String
- constants are quoted " ", with the usual C escapes
- recognized within. Expressions take on string or numeric values
- as appropriate, and are built using the operators + - * / %
- ^ (exponentiation), and concatenation (indicated by a
- blank). The operators ++ -- += -= *= /= %= ^= **= > >= < <=
- == != ?: are also available in expressions. Variables may
- be scalars, array elements (denoted x[i]) or fields.
- Variables are initialized to the null string. Array subscripts
- may be any string, not necessarily numeric; this allows for
- a form of associative memory. Multiple subscripts such as
- [i,j,k] are permitted; the constituents are concatenated,
- separated by the value of SUBSEP.
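-
- A short sketch of string subscripts and of how a multiple
- subscript is stored. The array names seen and grid are
- hypothetical, and SUBSEP is set to a colon only so that the
- combined key is printable:

```sh
# Count occurrences of each first field; subscripts are arbitrary strings.
counts=$(printf 'a\nb\na\n' | awk '
    { seen[$1]++ }
    END { print seen["a"], seen["b"] }')
printf '%s\n' "$counts"

# grid[1,2] is stored under the single key  1 SUBSEP 2.
joined=$(awk 'BEGIN {
    SUBSEP = ":"
    grid[1,2] = "x"
    for (k in grid) print k
}')
printf '%s\n' "$joined"
```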
-
- The print statement prints its arguments on the standard
- output (or on a file if >file or >>file is present or on a
- pipe if |cmd is present), separated by the current output
- field separator, and terminated by the output record
- separator. file and cmd may be literal names or parenthesized
- expressions; identical string values in different statements
- denote the same open file. The printf statement formats its
- expression list according to the format (see printf(3)).
- The built-in function close(expr) closes the file or pipe
- expr.
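-
- The following sketch shows output to a pipe and to a file.
- It assumes sort(1) and mktemp(1) are available; the
- variable name out and the sample words are made up:

```sh
# Both print statements name the same command string, so they share one pipe.
sorted=$(awk 'BEGIN {
    print "pear"  | "sort"
    print "apple" | "sort"
    close("sort")            # sort sees end of input and writes its output
    print "done"
}')
printf '%s\n' "$sorted"

# Both input lines go to the same open file; the name is passed as out=...
tmp=$(mktemp)
printf 'b\na\n' | awk '{ print $1 > out } END { close(out) }' out="$tmp" -
file_lines=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$file_lines"
```

- Without the call to close, the order in which the output of
- sort and the word done appear is not guaranteed.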
-
- The customary functions exp, log, sqrt, sin, cos, atan2 are
- built in. Other built-in functions:
-
- length
- the length of its argument taken as a string, or of $0
- if no argument.
-
- rand    random number on (0,1)
-
- srand
- sets seed for rand
-
- int     truncates to an integer value
-
- substr(s, m, n)
- the n-character substring of s that begins at position
- m counted from 1.
-
- index(s, t)
- the position in s where the string t occurs, or 0 if it
- does not.
-
- match(s, r)
- the position in s where the regular expression r
- occurs, or 0 if it does not. The variables RSTART and
- RLENGTH are set to the position and length of the
- matched string.
-
- split(s, a, fs)
- splits the string s into array elements a[1], a[2],
- ..., a[n], and returns n. The separation is done with
- the regular expression fs or with the field separator
- FS if fs is not given.
-
- sub(r, t, s)
- substitutes t for the first occurrence of the regular
- expression r in the string s. If s is not given, $0 is
- used.
-
- gsub    same as sub except that all occurrences of the regular
- expression are replaced; sub and gsub return the number
- of replacements.
-
- sprintf(fmt, expr, ... )
- the string resulting from formatting expr ... according
- to the printf(3) format fmt
-
- system(cmd)
- executes cmd and returns its exit status
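-
- Several of the string functions listed above can be tried
- in one BEGIN action. This is only a sketch; the strings and
- the names s, parts, n and changed are made up:

```sh
out=$(awk 'BEGIN {
    s = "hello world"
    print substr(s, 1, 5)             # 5 characters starting at position 1
    print index(s, "wor")             # position of a fixed string
    match(s, /o+/)                    # sets RSTART and RLENGTH
    print RSTART, RLENGTH
    n = split("a:b:c", parts, ":")    # n is the number of pieces
    print n, parts[2]
    changed = gsub(/o/, "0", s)       # replaces every o in s
    print changed, s
    print sprintf("%03d", 7)
}')
printf '%s\n' "$out"
```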
-
- The ``function'' getline sets $0 to the next input record
- from the current input file; getline <file sets $0 to the
- next record from file. getline x sets variable x instead.
- Finally, cmd | getline pipes the output of cmd into getline;
- each call of getline returns the next line of output from
- cmd. In all cases, getline returns 1 for a successful
- input, 0 for end of file, and -1 for an error.
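-
- A sketch of three of the getline forms and of the return
- value at end of file. The input lines, the command echo
- piped, and the names saved, line and status are all made
- up for illustration:

```sh
out=$(printf 'first\nsecond\nthird\n' | awk '
    NR == 1 {
        getline                      # $0 becomes the next record
        print "after getline:", $0
        getline saved                # next record goes to saved, not $0
        print "saved:", saved, "line:", $0
        while (("echo piped" | getline line) > 0)
            print "from cmd:", line  # one iteration per line of output
        status = getline             # no input is left
        print "at end:", status
    }')
printf '%s\n' "$out"
```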
-
- Patterns are arbitrary Boolean combinations (with ! || &&)
- of regular expressions and relational expressions. Regular
- expressions are as in egrep(1). Isolated regular
- expressions in a pattern apply to the entire line. Regular
- expressions may also occur in relational expressions, using
- the operators ~ and !~. /re/ is a constant regular
- expression; any string (constant or variable) may be used as a
- regular expression, except in the position of an isolated
- regular expression in a pattern.
-
- A pattern may consist of two patterns separated by a comma;
- in this case, the action is performed for all lines from an
- occurrence of the first pattern through an occurrence of the
- second.
-
- A relational expression is one of the following:
-
- expression matchop regular-expression
- expression relop expression
-
- where a relop is any of the six relational operators in C,
- and a matchop is either ~ (matches) or !~ (does not match).
- A conditional is an arithmetic expression, a relational
- expression, or a Boolean combination of these.
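-
- The three kinds of pattern can be compared on a small,
- invented input: an isolated regular expression, a Boolean
- combination of a relop and a matchop, and a string used as
- a regular expression:

```sh
out=$(printf 'apple 3\nbanana 12\ncherry 7\n' | awk '
    /an/                  { print "regex:", $1 }    # tested against $0
    $2 > 5 && $1 !~ /^b/  { print "combo:", $1 }    # relop and matchop
    $1 ~ "rr"             { print "string:", $1 }   # string as a regex
')
printf '%s\n' "$out"
```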
-
- The special patterns BEGIN and END may be used to capture
- control before the first input line is read and after the
- last. BEGIN and END do not combine with other patterns.
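-
- A minimal sketch of the order in which the three kinds of
- statement run, using two made-up input lines:

```sh
out=$(printf 'x\ny\n' | awk '
    BEGIN { print "start" }         # before any input is read
          { print NR ": " $0 }      # once per input line
    END   { print "lines:", NR }    # after the last line
')
printf '%s\n' "$out"
```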
-
- Variable names with special meanings:
-
- FS      regular expression used to separate fields; also
- settable by option -Ffs.
-
- NF      number of fields in the current record
-
- NR      ordinal number of the current record
-
- FNR     ordinal number of the current record in the current
- file
-
- FILENAME
- the name of the current input file
-
- RS      input record separator (default newline)
-
- OFS     output field separator (default blank)
-
- ORS     output record separator (default newline)
-
- OFMT    output format for numbers (default %.6g)
-
- SUBSEP
- separates multiple subscripts (default 034)
-
- ARGC    argument count, assignable
-
- ARGV    argument array, assignable; non-null members are taken
- as filenames
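-
- A sketch of a few of these variables at work. The input is
- made up, and the values given to OFS and OFMT are chosen
- only to make their effect visible:

```sh
out=$(printf 'a b c\nd e\n' | awk '
    BEGIN { OFS = "-"; OFMT = "%.2f" }
          { print NR, NF, $NF }     # record number, field count, last field
    END   { print 1/3 }             # non-integer values are printed with OFMT
')
printf '%s\n' "$out"
```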
-
- Functions may be defined (at the position of a pattern-
- action statement) thus:
-
- function foo(a, b, c) { ...; return x }
-
- Parameters are passed by value if scalar and by reference if
- array name; functions may be called recursively. Parameters
- are local to the function; all other variables are global.
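-
- The parameter rules can be checked with a short sketch.
- The function names fact and bump and the variables counts
- and v are hypothetical:

```sh
out=$(awk '
    # Recursion: fact calls itself until n reaches 1.
    function fact(n) { return (n <= 1) ? 1 : n * fact(n - 1) }

    # arr is passed by reference, x by value.
    function bump(arr, x) { arr["k"]++; x++ }

    BEGIN {
        counts["k"] = 1; v = 1
        bump(counts, v)
        print fact(5), counts["k"], v   # the array changed, the scalar did not
    }')
printf '%s\n' "$out"
```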
-
- EXAMPLES
- length > 72
-
- Print lines longer than 72 characters.
-
- { print $2, $1 }
-
- Print first two fields in opposite order.
-
- BEGIN { FS = ",[ \t]*|[ \t]+" }
- { print $2, $1 }
-
- Same, with input fields separated by comma and/or
- blanks and tabs.
-
- { s += $1 }
- END { print "sum is", s, " average is", s/NR }
-
- Add up first column, print sum and average.
-
- /start/, /stop/
-
- Print all lines between start/stop pairs.
-
- BEGIN { # Simulate echo(1)
- for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
- printf "\n"
- exit }
-
- SEE ALSO
- lex(1), sed(1)
- A. V. Aho, B. W. Kernighan, P. J. Weinberger, Awk - a
- Pattern Scanning and Processing Language: User's Manual
-
- BUGS
- There are no explicit conversions between numbers and
- strings. To force an expression to be treated as a number
- add 0 to it; to force it to be treated as a string
- concatenate "" to it.
- The scope rules for variables in functions are a botch.
- The (undocumented) options -S and -R are flaky.
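-
- The conversion idiom described at the start of this section
- can be sketched as follows; the input values and the names
- a and b are made up:

```sh
out=$(printf '10 9\n' | awk '{
    a = $1 ""; b = $2 ""             # concatenating "" forces string values
    print (a < b), (a + 0 < b + 0)   # string order, then numeric order
}')
printf '%s\n' "$out"
```

- As strings, "10" sorts before "9"; after adding 0 the same
- values compare as numbers.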
-
- AT&T UNIX System Toolchest          7 December 1987